Abstract

In this paper, we propose a new coding scheme for the general relay channel. This
coding scheme is in the form of a block Markov code. The transmitter uses a superposition
Markov code. The relay compresses the received signal and maps the compressed
version of the received signal into a codeword conditioned on the codeword of the previous
block. The receiver performs joint decoding after it has received all of the B blocks.
We show that this coding scheme can be viewed as a generalization of the well-known
Compress-And-Forward (CAF) scheme proposed by Cover and El Gamal. Our coding
scheme provides options for preserving the correlation between the channel inputs of
the transmitter and the relay, which is not possible in the CAF scheme. Thus, our
proposed scheme may potentially yield a larger achievable rate than the CAF scheme.

1 Introduction



As the simplest model for cooperative communications, the relay channel has attracted plenty of
attention since 1971, when it was first introduced by van der Meulen [1]. In 1979, Cover and 
El Gamal proposed two major coding schemes for the relay channel [2]. These two schemes 
are widely known as Decode-And-Forward (DAF) and Compress-And-Forward (CAF) to- 
day; see [3] for a recent review. These two coding schemes represent two different types 
of cooperation. In DAF, the cooperation is relatively obvious, where the relay decodes the 
message from the transmitter, and the transmitter and the relay cooperatively transmit the 
constructed common information to the receiver in the next block. In CAF, the cooperation 
spirit is less easy to recognize, as the message is sent by the transmitter only once. How- 
ever, the relay cooperates with the transmitter by compressing and sending its signal to the 
receiver. The rate gains in these achievable schemes are due to the fact that, through the 
channel from the transmitter to the relay, correlation is created between the transmitter and 
the relay, and this correlation is utilized to improve the rates. 

In the DAF scheme, correlation is created and then utilized in a block Markov coding
structure. More specifically, a full correlation is created by decoding the message fully at the 
relay, which enables the transmitter and the relay to create any kind of joint distribution 
for the channel inputs in the next block. The shortcoming of the DAF scheme is that by 
forcing the relay to decode the message in its entirety, it limits the overall achievable rate 
by the rate from the transmitter to the relay. In contrast, by not forcing a full decoding at 
the relay, the CAF scheme does not limit the overall rate by the rate from the transmitter 
to the relay, and may yield higher overall rates. The shortcoming of the CAF scheme, on 
the other hand, is that the correlation offered by the block coding structure is not utilized 
effectively, since in each block the channel inputs $X$ and $X_1$ from the transmitter and the
relay are independent, as the transmitter sends the message only once. 

However, the essence of good coding schemes in multi-user systems with correlated sources 
(e.g., [4,5]) is to preserve the correlation of the sources in the channel inputs. Motivated 
by this basic observation, in this paper, we propose a new coding scheme for the relay 
channel, that is based on the idea of preserving the correlation in the channel inputs from 
the transmitter and the relay. We will show that our new coding scheme may be viewed 
as a more general version of the CAF scheme, and therefore, our new coding scheme may 
potentially yield larger rates than the CAF scheme. Our proposed scheme can be further 
combined with the DAF scheme to yield rates that are potentially larger than those offered 
by both DAF and CAF schemes, similar in spirit to [2, Theorem 7]. 

Our new achievability scheme for the relay channel may be viewed as a variation of the 
coding scheme of Ahlswede and Han [5] for the multiple access channel with a correlated 
helper. In our work, we view the relay as the helper because the receiver does not need to 
decode the information sent by the relay. Also, we note that the relay is a correlated helper 
as the communication channel from the transmitter to the relay provides the relay, for free, with a
correlated version of the signal sent by the transmitter. The key aspects of the Ahlswede-
Han [5] scheme are: to preserve the correlation between the channel inputs of the transmitter 
and the helper (relay), and for the receiver to decode a "virtual" source, a compressed version 
of the helper, but not the entire signal of the helper. 

Our new coding scheme is in the form of block Markov coding. The transmitter uses 
a superposition Markov code, similar to the one used in the DAF scheme [2], except in 
the random codebook generation stage, a method similar to the one in [4] is used in order
to preserve the correlation between the blocks. Thus, in each block, the fresh information
message is mapped into a codeword conditioned on the codeword of the previous block.
Therefore, the overall codebook at the transmitter has a tree structure, where the codewords
in block $l$ emanate from the codewords in block $l - 1$. The depth of the tree is $B - 1$. A similar
strategy is applied at the relay side where the compressed version of the received signal is 
mapped into a two-block-long codeword conditioned on the codeword of the previous block. 
Therefore, the overall codebook at the relay has a tree structure as well. As a result of
this coding strategy, we successfully preserve the correlation between the channel inputs of
the transmitter and the relay. However, unlike the DAF scheme where a full correlation
is acquired through decoding at the relay, our scheme provides only a partially correlated 
helper at the relay by not trying to decode the transmitter's signal fully. From [4,5], we note 
that the channel inputs are correlated through the virtual sources in our case, and therefore, 
the channel inputs between the consecutive blocks are correlated. This correlation between 
the blocks will surely hurt the achievable rate. The correlation between the blocks is the 
price we pay for preserving the correlation between the channel inputs of the transmitter 
and the relay within any given block. 

At the decoding stage, we perform joint decoding for the entire B blocks after all of the B 
blocks have been received, which is different compared with the DAF and CAF schemes. The 
reason for performing joint decoding at the receiver is that due to the correlation between 
the blocks, decoding at any time before the end of all the B blocks would decrease the 
achievable rate. We note that joint decoding increases the decoding complexity and the delay 
as compared to DAF and CAF, though neither of these is a major concern in an information 
theoretic context. The only problem with the joint decoding strategy is that it makes 
the analysis difficult as it requires the evaluation of some mutual information expressions 
involving the joint probability distributions of up to B blocks of codes, where B is very large. 

The analysis of the error events provides us three conditions containing mutual informa- 
tion expressions involving infinite letters of the underlying random process. Evaluation of 
these mutual information expressions is very difficult, if not impossible. To obtain a com- 
putable result, we lower bound these mutual informations by noting some Markov structure 
in the underlying random process. This operation gives us three conditions to be satisfied 
by the achievable rates. These conditions involve eleven variables: the two channel inputs
from the transmitter and the relay, the two channel outputs at the relay and the receiver
and the compressed version of the channel output at the relay, in two consecutive blocks,
and the channel input from the transmitter in the block preceding these two.

We finish our analysis by revisiting the CAF scheme. We develop an equivalent repre- 
sentation for the achievable rates given in [2] for the CAF scheme. We then show that this 
equivalent representation for the achievable rates for the CAF scheme is a special case of the 
achievable rates in our new coding scheme, which is obtained by a special selection of the 
eleven variables mentioned above. We therefore conclude that our proposed coding scheme 
yields potentially larger rates than the CAF scheme. More importantly, our new coding 
scheme creates more possibilities, and therefore a spectrum of new achievable schemes for 
the relay channel through the selection of the underlying probability distribution, and yields 
the well-known CAF scheme as a special case, corresponding to a particular selection of the 
underlying probability distribution. 

2 The Relay Channel 

Consider a relay channel with finite input alphabets $\mathcal{X}$, $\mathcal{X}_1$ and finite output alphabets $\mathcal{Y}$,
$\mathcal{Y}_1$, characterized by the transition probability $p(y, y_1 | x, x_1)$. An $n$-length block code for the
relay channel $p(y, y_1 | x, x_1)$ consists of encoders $f$, $f_i$, $i = 1, \ldots, n$, and a decoder $g$,
where the encoder at the transmitter sends $x^n = f(m)$ into the channel, where $m \in \mathcal{M} =
\{1, 2, \ldots, M\}$; the encoder at the relay at the $i$th channel instance sends $x_{1i} = f_i(y_1^{i-1})$ into
the channel; the decoder outputs $\hat{m} = g(y^n)$. The average probability of error is defined as

$$P_e = \frac{1}{M} \sum_{m \in \mathcal{M}} \Pr(\hat{m} \neq m \mid m \text{ is transmitted}) \tag{1}$$

A rate $R$ is achievable for the relay channel $p(y, y_1 | x, x_1)$ if for every $0 < \epsilon < 1$, $\eta > 0$,
and every sufficiently large $n$, there exists an $n$-length block code $(f, f_i, g)$ with $P_e \le \epsilon$ and
$\frac{1}{n} \ln M \ge R - \eta$.

3 A New Achievability Scheme for the Relay Channel 

We adopt a block Markov coding scheme, similar to the DAF and CAF schemes. We have
$B$ blocks overall. In each block, we transmit codewords of length $n$. We denote the variables
in the $l$th block with a subscript of $[l]$. We denote $n$-letter codewords transmitted in each
block with a superscript of $n$. Following the standard relay channel literature, we denote
the (random) signals transmitted by the transmitter and the relay by $X$ and $X_1$, the signals
received at the receiver and the relay by $Y$ and $Y_1$, and the compressed version of $Y_1$ at the
relay by $\hat{Y}_1$. The realizations of these random signals will be denoted by lower-case letters.
For example, the $n$-letter signals transmitted by the transmitter and the relay in the $l$th
block will be represented by $x^n_{[l]}$ and $x^n_{1[l]}$.

Consider the following discrete time stationary Markov process $G_{[l]} = (X, \hat{Y}_1, X_1, Y, Y_1)_{[l]}$
for $l = 0, 1, \ldots, B$, with the transition probability distribution

$$p\big((x, \hat{y}_1, x_1, y, y_1)_{[l]} \mid (x, \hat{y}_1, x_1, y, y_1)_{[l-1]}\big)
= p(x_{[l]} | x_{[l-1]})\, p(y_{1[l]}, y_{[l]} | x_{[l]}, x_{1[l]})\, p(x_{1[l]} | \hat{y}_{1[l-1]})\, p(\hat{y}_{1[l]} | y_{1[l]}, x_{1[l]}) \tag{2}$$
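The factorization in (2) can be checked numerically on a toy example. The sketch below is illustrative only: it assumes binary alphabets for all five variables and fills the four factors with arbitrary random values (the names `make_toy_kernels`, `transition_prob` and `row_sum` are hypothetical and not part of the paper). It evaluates the right-hand side of (2) and confirms that, for every previous-block state, the transition probabilities sum to one.

```python
import itertools
import random


def random_pmf(size, rng):
    """Return a random probability vector of the given length."""
    weights = [rng.random() + 0.1 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def make_toy_kernels(rng):
    """Hypothetical binary-alphabet factors of (2); the values are arbitrary."""
    bits = (0, 1)
    p_x = {xp: random_pmf(2, rng) for xp in bits}        # p(x | x_prev)
    p_x1 = {yp: random_pmf(2, rng) for yp in bits}       # p(x1 | yhat1_prev)
    p_ch = {(x, x1): random_pmf(4, rng)                  # p(y1, y | x, x1)
            for x in bits for x1 in bits}
    p_q = {(y1, x1): random_pmf(2, rng)                  # p(yhat1 | y1, x1)
           for y1 in bits for x1 in bits}
    return p_x, p_x1, p_ch, p_q


def transition_prob(cur, prev, kernels):
    """Evaluate the right-hand side of (2) for one pair of block states.

    States are ordered as (x, yhat1, x1, y, y1), matching G_[l].
    """
    p_x, p_x1, p_ch, p_q = kernels
    x, yhat1, x1, y, y1 = cur
    x_prev, yhat1_prev = prev[0], prev[1]
    return (p_x[x_prev][x]
            * p_ch[(x, x1)][2 * y1 + y]
            * p_x1[yhat1_prev][x1]
            * p_q[(y1, x1)][yhat1])


def row_sum(prev, kernels):
    """Sum the transition probability over all current-block states."""
    return sum(transition_prob(cur, prev, kernels)
               for cur in itertools.product((0, 1), repeat=5))
```

Note that the current block depends on the previous one only through $x_{[l-1]}$ and $\hat{y}_{1[l-1]}$, which is why `transition_prob` reads just the first two coordinates of `prev`.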

The codebook generation and the encoding scheme for the $l$th block, $l = 1, \ldots, B - 1$, are
as follows.

Random codebook generation: Let $(x^n_{[l-1]}(m_{[l-1]}), x^n_{1[l-1]}, y^n_{[l-1]}, y^n_{1[l-1]})$ denote the transmitted
and the received signals in the $(l-1)$st block, where $m_{[l-1]}$ is the message sent by
the transmitter in the $(l-1)$st block. An illustration of the codebook structure is shown in
Figure 1.

1. For each $x^n_{[l-1]}(m_{[l-1]})$ sequence, generate $M$ sequences, where $x^n_{[l]}(m_{[l]})$, the $m_{[l]}$th
sequence, is generated independently according to $\prod_{i=1}^{n} p(x_{[l]i} | x_{[l-1]i})$. Here, every codeword
in the $(l-1)$st block expands into a codebook in the $l$th block. This expansion
is indicated by a directed cone from $x^n_{[l-1]}$ to $x^n_{[l]}$ in Figure 1.

2. For each $x^n_{1[l-1]}$ sequence, generate $L$ $\hat{Y}^n_{1[l-1]}$ sequences independently, uniformly
distributed in the conditional strong typical set$^1$ $T_\delta(x^n_{1[l-1]})$ with respect to the distribution
$p(\hat{y}_{1[l-1]} | x_{1[l-1]})$. If $\frac{1}{n} \ln L > I(\hat{Y}_{1[l-1]}; Y_{1[l-1]} | X_{1[l-1]})$, for any given $y^n_{1[l-1]}$ sequence,
there exists one $\hat{y}^n_{1[l-1]}$ sequence with high probability when $n$ is sufficiently large such
that $(\hat{y}^n_{1[l-1]}, y^n_{1[l-1]}, x^n_{1[l-1]})$ are jointly typical according to the probability distribution
$p(\hat{y}_{1[l-1]}, y_{1[l-1]}, x_{1[l-1]})$. Denote this $\hat{y}^n_{1[l-1]}$ as $\hat{y}^n_{1[l-1]}(y^n_{1[l-1]}, x^n_{1[l-1]})$. Here, the quantization
from $y^n_{1[l-1]}$ to $\hat{y}^n_{1[l-1]}$, parameterized by $x^n_{1[l-1]}$, is indicated in Figure 1 by a directed
cone from $y^n_{1[l-1]}$ to $\hat{y}^n_{1[l-1]}$, with a straight line from $x^n_{1[l-1]}$ for the parameterization.

3. For each $\hat{y}^n_{1[l-1]}$, generate one $x^n_{1[l]}$ sequence according to $\prod_{i=1}^{n} p(x_{1[l]i} | \hat{y}_{1[l-1]i})$. This
one-to-one mapping is indicated by a straight line between $\hat{y}^n_{1[l-1]}$ and $x^n_{1[l]}$ in Figure 1.
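Step 1 above produces a tree-structured codebook at the transmitter. The following is a minimal sketch of that expansion, under assumptions that are not in the paper: a binary input alphabet, and a conditional distribution $p(x_{[l]} | x_{[l-1]})$ that flips each letter of the parent codeword independently with probability `flip_prob`. The function names are hypothetical; `num_messages` plays the role of $M$ and `depth` the role of $B - 1$.

```python
import random


def child_codeword(parent, flip_prob, rng):
    """Draw one length-n word letter by letter from p(x_[l],i | x_[l-1],i).

    Assumed toy kernel: each parent bit is flipped with probability flip_prob.
    """
    return tuple(bit ^ (rng.random() < flip_prob) for bit in parent)


def expand(parent, num_messages, flip_prob, rng):
    """Expand one codeword of block l-1 into a codebook of M words for block l."""
    return [child_codeword(parent, flip_prob, rng) for _ in range(num_messages)]


def build_tree(root, num_messages, depth, flip_prob, rng):
    """Return levels, where levels[l] maps a message path (m_1, ..., m_l) to a codeword."""
    levels = [{(): root}]
    for _ in range(depth):
        next_level = {}
        for path, word in levels[-1].items():
            children = expand(word, num_messages, flip_prob, rng)
            for message, child in enumerate(children):
                next_level[path + (message,)] = child
        levels.append(next_level)
    return levels
```

With `flip_prob = 0.5` the children are independent of the parent, which corresponds to the independent generation used in CAF; smaller values keep consecutive blocks correlated, at the cost of a codebook that grows as $M^l$ with the block index.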

Encoding: Let $m_{[l]}$ be the message to be sent in this block. If $(x^n_{[l-1]}(m_{[l-1]}), x^n_{1[l-1]})$ are sent
and $y^n_{1[l-1]}$ is received in the previous block, we choose $(x^n_{[l]}(m_{[l]}), \hat{y}^n_{1[l-1]}(y^n_{1[l-1]}, x^n_{1[l-1]}), x^n_{1[l]})$
according to the code generation method described above and transmit $(x^n_{[l]}(m_{[l]}), x^n_{1[l]})$. In
the first block, we assume a virtual 0th block, where $(x^n_{[0]}, x^n_{1[0]}, \hat{y}^n_{1[0]})$, as well as $x^n_{1[1]}$, are
known by the transmitter, the relay and the receiver. In the $B$th block, the transmitter
randomly generates one $x^n_{[B]}$ sequence according to $\prod_{i=1}^{n} p(x_{[B]i} | x_{[B-1]i})$ and sends it into the
channel. The relay, after receiving $y^n_{1[B]}$, randomly generates one $\hat{y}^n_{1[B]}$ sequence according to
$\prod_{i=1}^{n} p(\hat{y}_{1[B]i} | y_{1[B]i}, x_{1[B]i})$. We assume that the transmitter and the relay reliably transmit
$x^n_{[B]}$ and $\hat{y}^n_{1[B]}$ to the receiver using the next $b$ blocks, where $b$ is some finite positive integer.
We note that $B + b$ blocks are used in our scheme, while only the first $B - 1$ blocks carry
the message. Thus, the final achievable rate is $\frac{B-1}{B+b} \cdot \frac{1}{n} \ln M$, which converges to $\frac{1}{n} \ln M$ for
sufficiently large $B$ since $b$ is finite.

$^1$ Strong typical set and conditional strong typical set are defined in [6, Definition 1.2.8, 1.2.9]. For the
sake of simplicity, we omit the subscript which is used to indicate the underlying distribution in [6].

Figure 1: Codebook structure.
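The rate loss from the overhead blocks can be made concrete with a few lines of code. This is only an illustration of the factor $(B-1)/(B+b)$ stated above; the function name and the numbers used in the usage note are hypothetical.

```python
from math import log


def effective_rate(num_blocks, tail_blocks, n, num_messages):
    """Rate in nats per channel use: (B - 1) / (B + b) * (1 / n) * ln M.

    num_blocks is B, tail_blocks is b, n is the block length and
    num_messages is M, the number of messages per block.
    """
    per_block_rate = log(num_messages) / n
    return (num_blocks - 1) / (num_blocks + tail_blocks) * per_block_rate
```

For example, with $b = 5$ the factor is $9/15 = 0.6$ at $B = 10$ but exceeds $0.99999$ at $B = 10^6$, so the penalty vanishes as $B$ grows with $b$ fixed.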

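Decoding: After receiving $B$ blocks of sequences, i.e., $y^n_{[1]}, \ldots, y^n_{[B]}$, and assuming $x^n_{1[1]}$,
$x^n_{[B]}$ and $\hat{y}^n_{1[B]}$ are known at the receiver, we seek $x^n_{[1]}, \ldots, x^n_{[B-1]}$, $\hat{y}^n_{1[1]}, \ldots, \hat{y}^n_{1[B-1]}$, $x^n_{1[2]}, \ldots, x^n_{1[B]}$,
such that

$$(x^n_{[1]}, \ldots, x^n_{[B]}, \hat{y}^n_{1[1]}, \ldots, \hat{y}^n_{1[B]}, x^n_{1[1]}, \ldots, x^n_{1[B]}, y^n_{[1]}, \ldots, y^n_{[B]}) \in T_\delta$$

according to the stationary distribution of the Markov process $G_{[l]}$ in (2).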

The differences between our scheme and the CAF scheme are as follows. At the transmitter
side, in our scheme, the fresh message $m_{[l]}$ is mapped into the codeword $x^n_{[l]}$ conditioned on
the codeword of the previous block $x^n_{[l-1]}$, while in the CAF scheme, $m_{[l]}$ is mapped into $x^n_{[l]}$,
which is generated independent of $x^n_{[l-1]}$. At the relay side, in our scheme, the compressed
received signal $\hat{y}^n_{1[l-1]}$ is mapped into the codeword $x^n_{1[l]}$, which is generated according to
$p(x_{1[l]} | \hat{y}_{1[l-1]})$, while in the CAF scheme, $x^n_{1[l]}$ is generated independent of $\hat{y}^n_{1[l-1]}$. The aim
of our design is to preserve the correlation built in the $(l-1)$st block in the channel inputs
of the $l$th block. At the decoding stage, we perform joint decoding for the entire $B$ blocks
after all of the $B$ blocks have been received, while in the CAF scheme, the decoding of the
message of the $(l-1)$st block is performed at the end of the $l$th block.

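Probability of error: When $n$ is sufficiently large, the probability of error can be made
arbitrarily small when the following conditions are satisfied.

1. For all $j$ such that $1 \le j \le B - 1$,

$$\frac{1}{n}(B - j) \ln M + (B - j) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]})
\le I(X_{[j]}^{[B-1]}, \hat{Y}_{1[j]}^{[B-1]}, X_{1[j+1]}^{[B]}; Y_{[j]}^{[B]}, \hat{Y}_{1[B]}, X_{[B]} | X_{[j-1]}, X_{1[j]}) \tag{3}$$

2. For all $j, k$ such that $1 \le j < k \le B - 1$,

$$\frac{1}{n}(B - j) \ln M + (B - k) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]})
\le I(X_{[j]}^{[B-1]}, \hat{Y}_{1[k]}^{[B-1]}, X_{1[k+1]}^{[B]}; Y_{[j]}^{[B]}, \hat{Y}_{1[B]}, X_{[B]}, \hat{Y}_{1[j]}^{[k-1]}, X_{1[j+1]}^{[k]} | X_{[j-1]}, X_{1[j]}) \tag{4}$$

3. For all $j, k$ such that $1 \le k < j \le B - 1$,

$$(j - k) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) + \frac{1}{n}(B - j) \ln M + (B - j) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]})
\le I(X_{[j]}^{[B-1]}, \hat{Y}_{1[k]}^{[B-1]}, X_{1[k+1]}^{[B]}; Y_{[k]}^{[B]}, \hat{Y}_{1[B]}, X_{[B]}, X_{[k]}^{[j-1]} | X_{[k-1]}, X_{1[k]}) \tag{5}$$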

where the subscript $[l]$ on the left hand sides of (3), (4) and (5) indicates that the corresponding
random variables belong to a generic sample $g_{[l]}$ of the underlying random process in (2).
The details of the calculation of the probability of error where these conditions are obtained
can be found in Appendix A.1. The derivation uses standard techniques from information
theory, such as counting error events, etc.

In the above conditions, we used the notation $A_{[j]}^{[B]}$ as a shorthand to denote the sequence
of random variables $A_{[j]}, A_{[j+1]}, \ldots, A_{[B]}$. Consequently, we note that the mutual informations
on the right hand sides of (3), (4) and (5) contain vectors of random variables whose lengths
go up to $B$, where $B$ is very large. In order to simplify the conditions in (3), (4) and (5),
we lower bound the mutual information expressions on the right hand sides of (3), (4) and
(5) by those that involve random variables that belong to up to three blocks. The detailed
derivation of the following lower bounding operation can be found in Appendix A.2. The
derivation uses standard techniques from information theory, such as the chain rule of mutual
information, and exploiting the Markov structure of the involved random variables.

1. For all $j$ such that $1 \le j \le B - 1$,

$$(B - j) \left( \frac{1}{n} \ln M + I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) \right)
\le (B - j) I(Y_{[l]}; X_{[l]}, \hat{Y}_{1[l]}, X_{1[l]} | X_{[l-2]}, X_{1[l-1]}, Y_{[l-1]}) \tag{6}$$

2. For all $j, k$ such that $1 \le j < k \le B - 1$,

$$(k - j) \frac{1}{n} \ln M + (B - k) \left( \frac{1}{n} \ln M + I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) \right)
\le (k - j) I(X_{[l]}; Y_{[l]}, \hat{Y}_{1[l]} | X_{1[l]}, Y_{[l-1]}, \hat{Y}_{1[l-1]}, X_{1[l-1]}, X_{[l-2]})
+ (B - k) I(Y_{[l]}; X_{[l]}, \hat{Y}_{1[l]}, X_{1[l]} | X_{[l-2]}, X_{1[l-1]}, Y_{[l-1]}) \tag{7}$$

3. For all $j, k$ such that $1 \le k < j \le B - 1$,

$$(j - k) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) + (B - j) \left( \frac{1}{n} \ln M + I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) \right)
\le (j - k) I(Y_{[l]}; \hat{Y}_{1[l]}, X_{1[l]} | X_{[l]}, X_{[l-1]}, X_{1[l-1]}, Y_{[l-1]})
+ (B - j) I(Y_{[l]}; X_{[l]}, \hat{Y}_{1[l]}, X_{1[l]} | X_{[l-2]}, X_{1[l-1]}, Y_{[l-1]}) \tag{8}$$

We can further derive sufficient conditions for the above three conditions in (6), (7) and
(8) as follows. We define the following quantities:

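$$C_1 \triangleq \frac{1}{n} \ln M + I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) \tag{9}$$
$$C_2 \triangleq \frac{1}{n} \ln M \tag{10}$$
$$C_3 \triangleq I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) \tag{11}$$
$$D_1 \triangleq I(Y_{[l]}; X_{[l]}, \hat{Y}_{1[l]}, X_{1[l]} | X_{[l-2]}, X_{1[l-1]}, Y_{[l-1]}) \tag{12}$$
$$D_2 \triangleq I(X_{[l]}; Y_{[l]}, \hat{Y}_{1[l]} | X_{1[l]}, Y_{[l-1]}, \hat{Y}_{1[l-1]}, X_{1[l-1]}, X_{[l-2]}) \tag{13}$$
$$D_3 \triangleq I(Y_{[l]}; \hat{Y}_{1[l]}, X_{1[l]} | X_{[l]}, X_{[l-1]}, X_{1[l-1]}, Y_{[l-1]}) \tag{14}$$

Then, the sufficient conditions in (6), (7) and (8) can also be written as,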

1. For all $j$ such that $1 \le j \le B - 1$,

$$(B - j) C_1 \le (B - j) D_1 \tag{15}$$

2. For all $j, k$ such that $1 \le j < k \le B - 1$,

$$(k - j) C_2 + (B - k) C_1 \le (k - j) D_2 + (B - k) D_1 \tag{16}$$

3. For all $j, k$ such that $1 \le k < j \le B - 1$,

$$(j - k) C_3 + (B - j) C_1 \le (j - k) D_3 + (B - j) D_1 \tag{17}$$
We note that the above conditions are implied by the following three conditions,

$$C_1 \le D_1 \tag{18}$$
$$C_2 \le D_2 \tag{19}$$
$$C_3 \le D_3 \tag{20}$$

or in other words, by,

$$R - \eta \le \frac{1}{n} \ln M \le I(X_{[l]}; Y_{[l]}, \hat{Y}_{1[l]} | X_{1[l]}, Y_{[l-1]}, \hat{Y}_{1[l-1]}, X_{1[l-1]}, X_{[l-2]}) \tag{21}$$
$$I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) \le I(Y_{[l]}; \hat{Y}_{1[l]}, X_{1[l]} | X_{[l]}, X_{[l-1]}, X_{1[l-1]}, Y_{[l-1]}) \tag{22}$$
$$R - \eta + I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) \le I(Y_{[l]}; X_{[l]}, \hat{Y}_{1[l]}, X_{1[l]} | X_{[l-2]}, X_{1[l-1]}, Y_{[l-1]}) \tag{23}$$
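The step from the three per-block inequalities $C_i \le D_i$ to the block-weighted conditions is a nonnegative weighted sum. Written out as a short worked step (no new assumptions beyond the index ranges stated in the conditions):

```latex
% For 1 <= j <= B-1 the weight (B-j) is nonnegative, so C_1 <= D_1 gives
\begin{align*}
(B-j)\,C_1 &\le (B-j)\,D_1 .
\intertext{For $1 \le j < k \le B-1$ the weights $(k-j)$ and $(B-k)$ are nonnegative, so
$C_2 \le D_2$ and $C_1 \le D_1$ give}
(k-j)\,C_2 + (B-k)\,C_1 &\le (k-j)\,D_2 + (B-k)\,D_1 .
\intertext{For $1 \le k < j \le B-1$ the weights $(j-k)$ and $(B-j)$ are nonnegative, so
$C_3 \le D_3$ and $C_1 \le D_1$ give}
(j-k)\,C_3 + (B-j)\,C_1 &\le (j-k)\,D_3 + (B-j)\,D_1 .
\end{align*}
```

The per-block conditions are therefore sufficient for every admissible pair $(j, k)$, which is what removes the dependence on $B$.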



The expressions in (21), (22) and (23) give sufficient conditions to be satisfied by the rate
in order for the probability of error to become arbitrarily close to zero. We note that these
conditions depend on variables used in three consecutive blocks, $l$, $l - 1$ and $l - 2$. With
this development, we obtain the main result of our paper which is stated in the following
theorem.

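Theorem 1 The rate $R$ is achievable for the relay channel, if the following conditions are
satisfied

$$R \le I(Y, \hat{Y}_1; X | X_1, \tilde{\hat{Y}}_1, \tilde{Y}, \tilde{X}_1, \tilde{\tilde{X}}) \tag{24}$$
$$I(\hat{Y}_1; Y_1 | X_1, X) \le I(Y; \hat{Y}_1, X_1 | X, \tilde{Y}, \tilde{X}, \tilde{X}_1) \tag{25}$$
$$R + I(\hat{Y}_1; Y_1 | X_1, X) \le I(Y; \hat{Y}_1, X_1, X | \tilde{Y}, \tilde{X}_1, \tilde{\tilde{X}}) \tag{26}$$

where

$$\tilde{\tilde{X}} \to (\tilde{X}, \tilde{\hat{Y}}_1, \tilde{X}_1, \tilde{Y}, \tilde{Y}_1) \to (X, \hat{Y}_1, X_1, Y, Y_1) \tag{27}$$
$$p(\tilde{x}, \tilde{\hat{y}}_1, \tilde{x}_1, \tilde{y}_1, \tilde{\tilde{x}}) = p(x, \hat{y}_1, x_1, y_1, \tilde{x}) \tag{28}$$
$$p(x, \hat{y}_1, x_1, y, y_1 | \tilde{x}, \tilde{\hat{y}}_1, \tilde{x}_1, \tilde{y}, \tilde{y}_1) = p(x | \tilde{x})\, p(x_1 | \tilde{\hat{y}}_1)\, p(y_1, y | x, x_1)\, p(\hat{y}_1 | y_1, x_1) \tag{29}$$

In the above theorem, the notations $\tilde{\ }$ and $\tilde{\tilde{\ }}$ are used to denote the signals belonging to
the previous block and the block before the previous block, respectively, with respect to a
reference block. Therefore, we see that the achievable rate in the relay channel, using our
proposed coding scheme, needs to satisfy three conditions that involve mutual information
expressions calculated using eleven variables which satisfy the Markov chain constraint in
(27), the marginal distribution constraint in (28), and the additional inter-block probability
distribution constraint in (29).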

In the next section, we will revisit the well-known CAF scheme proposed in [2]. First, we
will develop an equivalent representation for the well-known representation of the achievable
rate in the CAF scheme. We will then show that the rates achievable by the CAF scheme can
be achieved with our proposed scheme by choosing a certain special structure for the joint
probability distribution of the eleven random variables in Theorem 1 while still satisfying
the three conditions in (27), (28) and (29).

4 Revisiting the Compress-And-Forward (CAF) Scheme

In [2], the achievable rates for the CAF are characterized as in the following theorem. 

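Theorem 2 ([2]) The rate $R$ is achievable for the relay channel, if the following conditions
are satisfied

$$R \le I(X; Y, \hat{Y}_1 | X_1) \tag{30}$$
$$I(\hat{Y}_1; Y_1 | X_1, Y) \le I(X_1; Y) \tag{31}$$

where

$$p(x, x_1, y, y_1, \hat{y}_1) = p(x)\, p(x_1)\, p(y, y_1 | x, x_1)\, p(\hat{y}_1 | y_1, x_1) \tag{32}$$

In the following theorem, we present three equivalent forms for the rate achievable by the
CAF scheme.

Theorem 3 The following three conditions are equivalent.

1. For some $p(x, x_1, y, y_1, \hat{y}_1) = p(x)\, p(x_1)\, p(y, y_1 | x, x_1)\, p(\hat{y}_1 | y_1, x_1)$

$$R - I(X; \hat{Y}_1 | X_1) \le I(X; Y | \hat{Y}_1, X_1) \tag{33}$$
$$I(\hat{Y}_1; Y_1 | X_1) \le I(\hat{Y}_1; Y | X_1) + I(X_1; Y) \tag{34}$$

2. For some $p(x, x_1, y, y_1, \hat{y}_1) = p(x)\, p(x_1)\, p(y, y_1 | x, x_1)\, p(\hat{y}_1 | y_1, x_1)$

$$R - I(X; \hat{Y}_1 | X_1) \le I(X; Y | \hat{Y}_1, X_1) \tag{35}$$
$$R - I(X; \hat{Y}_1 | X_1) + I(\hat{Y}_1; Y_1 | X_1) \le I(X, \hat{Y}_1; Y | X_1) + I(X_1; Y) \tag{36}$$

3. For some $p(x, x_1, y, y_1, \hat{y}_1) = p(x)\, p(x_1)\, p(y, y_1 | x, x_1)\, p(\hat{y}_1 | y_1, x_1)$

$$R - I(X; \hat{Y}_1 | X_1) \le I(X; Y | \hat{Y}_1, X_1) \tag{37}$$
$$I(\hat{Y}_1; Y_1 | X_1, X) \le I(\hat{Y}_1; Y | X_1, X) + I(X_1; Y | X) \tag{38}$$
$$R - I(X; \hat{Y}_1 | X_1) + I(\hat{Y}_1; Y_1 | X_1) \le I(X, \hat{Y}_1; Y | X_1) + I(X_1; Y) \tag{39}$$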
The proof of the above theorem is given in Appendix A.3.

We rewrite the final equivalent representation in (37), (38) and (39) in the following more
compact form in order to compare the rates achievable with our proposed scheme and the
rates achievable with the CAF scheme in the next section.

$$R \le I(X; Y, \hat{Y}_1 | X_1) \tag{40}$$
$$I(\hat{Y}_1; Y_1 | X_1, X) \le I(\hat{Y}_1, X_1; Y | X) \tag{41}$$
$$R + I(\hat{Y}_1; Y_1 | X_1, X) \le I(X, \hat{Y}_1, X_1; Y) \tag{42}$$
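The compact form follows from the chain rule for mutual information together with the Markov chain $X \to (Y_1, X_1) \to \hat{Y}_1$ implied by the factor $p(\hat{y}_1 | y_1, x_1)$. A sketch of the three rewriting steps:

```latex
\begin{align*}
% third-form rate condition -> (40): move I(X; \hat{Y}_1 | X_1) to the right, then chain rule
I(X;\hat{Y}_1 \mid X_1) + I(X;Y \mid \hat{Y}_1, X_1) &= I(X;Y,\hat{Y}_1 \mid X_1), \\
% third-form compression condition -> (41): chain rule on the right-hand side
I(\hat{Y}_1;Y \mid X_1, X) + I(X_1;Y \mid X) &= I(\hat{Y}_1, X_1;Y \mid X), \\
% third-form sum condition -> (42), left side: since X -> (Y_1, X_1) -> \hat{Y}_1,
% I(\hat{Y}_1;Y_1|X_1) = I(\hat{Y}_1;X|X_1) + I(\hat{Y}_1;Y_1|X_1,X)
I(\hat{Y}_1;Y_1 \mid X_1) - I(X;\hat{Y}_1 \mid X_1) &= I(\hat{Y}_1;Y_1 \mid X_1, X), \\
% third-form sum condition -> (42), right side: chain rule
I(X,\hat{Y}_1;Y \mid X_1) + I(X_1;Y) &= I(X,\hat{Y}_1,X_1;Y).
\end{align*}
```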

5 Comparison of the Achievable Rates with Our Scheme 
and with the CAF Scheme 

We note that the conditions on the achievable rates with our scheme given in Theorem 1,
i.e., (24), (25), (26), are very similar to the final equivalent form for the conditions on
the achievable rates with the CAF scheme, i.e., (40), (41), (42), except for two differences.
First, the channel inputs of the transmitter and the relay, i.e., $X$ and $X_1$, in our proposed
scheme can be correlated, while in the CAF scheme they are independent, and second, in our
scheme there are some extra random variables on which the mutual information expressions are
conditioned, e.g., $\tilde{X}$, $\tilde{X}_1$, $\tilde{Y}$, $\tilde{\hat{Y}}_1$, $\tilde{\tilde{X}}$. These two differences come from our coding scheme
where we introduced correlation between the channel inputs of the transmitter and the relay
in a block, and between the variables across the blocks. The correlation between the channel
inputs from the transmitter and the relay in any block is an advantage, as for channels which
favor correlation, this translates into higher rates. However, the correlation across the blocks
is a disadvantage as it decreases the efficiency of transmission, and therefore the achievable
rates. In fact, the price we pay for the correlation between the channel inputs in any given
block is precisely the correlation we have created across the blocks. For a given correlation
structure, it is not clear which of these two opposite effects will overcome the other. That is,
the rate of our scheme for a certain correlated distribution may be lower or higher than the
rate of the CAF scheme. However, we note that the CAF scheme can be viewed as a special
case of our proposed scheme by choosing an independent distribution, i.e., by choosing the
following conditional distribution in (29)

$$p(x, \hat{y}_1, x_1, y, y_1 | \tilde{x}, \tilde{\hat{y}}_1, \tilde{x}_1, \tilde{y}, \tilde{y}_1) = p(x)\, p(x_1)\, p(y_1, y | x, x_1)\, p(\hat{y}_1 | x_1, y_1) \tag{43}$$

In this case, the expressions in Theorem 1, i.e., (24), (25), (26), degenerate into the third
equivalent form for the CAF scheme in Theorem 3, i.e., (40), (41), (42). The above observation
implies that the maximum achievable rate with our proposed scheme over all possible
distributions is not less than the achievable rate of the CAF scheme. Thus, we can claim
that this paper offers more choices in the achievability scheme than the CAF scheme, and
that these choices may potentially yield larger achievable rates than those offered by the
CAF scheme.
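Comparing the two sets of rate conditions for a concrete channel requires evaluating conditional mutual informations under a chosen joint distribution. The following is a minimal pure-Python sketch of such an evaluator, offered as an illustration rather than as part of the paper: the joint distribution is assumed to be given as a dictionary from outcome tuples to probabilities, and the function names and the coordinate-index convention are hypothetical.

```python
from collections import defaultdict
from math import log


def marginal(pmf, idx):
    """Marginalize a joint pmf {outcome tuple: prob} onto the coordinates in idx."""
    out = defaultdict(float)
    for outcome, prob in pmf.items():
        out[tuple(outcome[i] for i in idx)] += prob
    return out


def cond_mutual_info(pmf, a, b, c=()):
    """Return I(A; B | C) in nats.

    a, b and c are tuples of coordinate indices into the outcome tuples;
    c may be empty, in which case the unconditional I(A; B) is returned.
    """
    p_abc = marginal(pmf, a + b + c)
    p_ac = marginal(pmf, a + c)
    p_bc = marginal(pmf, b + c)
    p_c = marginal(pmf, c)
    len_a, len_b = len(a), len(b)
    total = 0.0
    for key, prob in p_abc.items():
        if prob <= 0.0:
            continue
        key_a = key[:len_a]
        key_b = key[len_a:len_a + len_b]
        key_c = key[len_a + len_b:]
        total += prob * log(prob * p_c[key_c]
                            / (p_ac[key_a + key_c] * p_bc[key_b + key_c]))
    return total
```

As a usage example, if outcomes are ordered as $(x, x_1, y, y_1, \hat{y}_1)$, the right-hand side of (40) would be `cond_mutual_info(pmf, (0,), (2, 4), (1,))`. Evaluating Theorem 1 in the same way needs a joint distribution over all eleven variables that satisfies (27) through (29).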



A Appendix 

A.1 Probability of Error Calculation

The average probability of decoding error can be expressed as follows, 

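$$P_e = \Pr(E_1 \cup E_2) = \Pr(E_1) + \Pr(E_2 \cap E_1^c) \tag{44}$$

where

$$E_1 = \left\{ (x^n_{[1,\ldots,B]}, \hat{y}^n_{1[1,\ldots,B]}, x^n_{1[1,\ldots,B]}, y^n_{[1,\ldots,B]}) \notin T_\delta \right\} \tag{45}$$

$$E_2 = \bigcup_{(\bar{x}^n_{[1,\ldots,B]}, \bar{\hat{y}}^n_{1[1,\ldots,B-1]}) \neq (x^n_{[1,\ldots,B]}, \hat{y}^n_{1[1,\ldots,B-1]})}
\left\{ (\bar{x}^n_{[1,\ldots,B]}, \bar{\hat{y}}^n_{1[1,\ldots,B]}, \bar{x}^n_{1[1,\ldots,B]}, y^n_{[1,\ldots,B]}) \in T_\delta \right\} \tag{46}$$

where $(\bar{x}^n_{[1,\ldots,B]}, \bar{\hat{y}}^n_{1[1,\ldots,B-1]}, \bar{x}^n_{1[2,\ldots,B]})$ is another codeword that is generated according to the
rules of our scheme.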

From (2), we note the following Markov properties:

1. conditioned on {Ym], X^, Xi[i]), Y[i] is independent of and 

2. conditioned on $(X_{[l-1]}, \hat{Y}_{1[l-1]})$, $G_{[l,\ldots]}$ is independent of $G_{[\ldots,l-1]}$.

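Here, and in the sequel, subscript $[l]$ refers to a generic block within the overall $B$ blocks.
$\Pr(E_1)$ can be upper bounded as follows:

$$\Pr(E_1) \le \sum_{l=1}^{B} \Big( \Pr\big( (x^n_{[l]}, x^n_{1[l]}, y^n_{[l]}, y^n_{1[l]}, g^n_{[\ldots,l-1]}) \notin T_\delta \,\big|\, g^n_{[\ldots,l-1]} \in T_\delta \big)
+ \Pr\big( (\hat{y}^n_{1[l]}, x^n_{[l]}, x^n_{1[l]}, y^n_{[l]}, y^n_{1[l]}, g^n_{[\ldots,l-1]}) \notin T_\delta \,\big|\, (x^n_{[l]}, x^n_{1[l]}, y^n_{[l]}, y^n_{1[l]}, g^n_{[\ldots,l-1]}) \in T_\delta \big) \Big) \tag{47}$$

From the way the code is generated, we have

$$\Pr\big( (x^n_{[l]}, x^n_{1[l]}, y^n_{[l]}, y^n_{1[l]}, g^n_{[\ldots,l-1]}) \notin T_\delta \,\big|\, g^n_{[\ldots,l-1]} \in T_\delta \big) \le \epsilon \tag{48}$$

The compression from $y^n_{1[l]}$ to $\hat{y}^n_{1[l]}$ is a conditional version of a rate-distortion code. If
$R' > I(\hat{Y}_1; Y_1 | X_1)$, then, when $n$ is sufficiently large, we have

$$\Pr\big( (\hat{y}^n_{1[l]}, x^n_{[l]}, x^n_{1[l]}, y^n_{[l]}, y^n_{1[l]}, g^n_{[\ldots,l-1]}) \notin T_\delta \,\big|\, (x^n_{[l]}, x^n_{1[l]}, y^n_{[l]}, y^n_{1[l]}, g^n_{[\ldots,l-1]}) \in T_\delta \big) \le \epsilon \tag{49}$$

Thus,

$$\Pr(E_1) \le 2B\epsilon \tag{50}$$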
Now we switch to the error event $E_2$.

$$\Pr(E_2 \cap E_1^c)
= \sum_{(x^n_{[1,\ldots,B]}, \hat{y}^n_{1[1,\ldots,B]}, x^n_{1[1,\ldots,B]}, y^n_{[1,\ldots,B]}) \in T_\delta}
p(x^n_{[1,\ldots,B]}, \hat{y}^n_{1[1,\ldots,B]}, x^n_{1[1,\ldots,B]}, y^n_{[1,\ldots,B]})
\Pr\big(E_2 \,\big|\, (x^n_{[1,\ldots,B]}, \hat{y}^n_{1[1,\ldots,B]}, x^n_{1[1,\ldots,B]}, y^n_{[1,\ldots,B]}) \text{ sent}\big)
\le \max_{(x^n_{[1,\ldots,B]}, \hat{y}^n_{1[1,\ldots,B]}, x^n_{1[1,\ldots,B]}, y^n_{[1,\ldots,B]}) \in T_\delta}
\Pr\big(E_2 \,\big|\, (x^n_{[1,\ldots,B]}, \hat{y}^n_{1[1,\ldots,B]}, x^n_{1[1,\ldots,B]}, y^n_{[1,\ldots,B]}) \text{ sent}\big) \tag{51}$$

From our proposed coding scheme, we note that the codebooks at both the transmitter and
the relay have tree structures with $B - 1$ stages. A correct codeword $x^n_{[1,\ldots,B-1]}$ can be viewed
as a path in the tree-structured codebook at the transmitter. The same holds for the codeword
$\hat{y}^n_{1[1,\ldots,B-1]}$ at the relay. An error occurs when we diverge from the correct path at a certain
stage in the tree. Thus, the error event $E_2$ can be decomposed as

u u 

(^w -^ii'5r[ii--'%-ii)=(^[ii--B-ii-?iii---%-ii) 

(^Di-%i)^(-Si-rw) 

(^fl] ) ■ ■ ■ ) ^[B] ) Vlll] ) ■ ■ ■ ) VllB] ) ^1[1] ) ■ ■ ■ ) ^1[B] ) y[l] ) ■ ■ ■ ) VlB]) £ ^ (52) 

where each term in the union in the above equation represents the error event that results 
when we diverge from the correct paths at the jth stage at the transmitter and at the kth 
stage at the relay. 

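Let us define $\mathcal{F}_1$ to be the set consisting of all feasible codeword pairs $(x^n_{[j]}, \hat{y}^n_{1[j]})$ for the
$j$th block for a given $x^n_{[j-1]}$ and $x^n_{1[j]}$. Then, we have

$$\begin{aligned}
F_1 \triangleq |\mathcal{F}_1| &\le M L \, \frac{\exp(n(H(\hat{Y}_{1[j]} | X_{[j]}, X_{1[j]}) + 2\epsilon))}{(1 - \epsilon) \exp(n(H(\hat{Y}_{1[j]} | X_{1[j]}) - 2\epsilon))} \\
&\le M \exp(n(I(\hat{Y}_{1[j]}; Y_{1[j]} | X_{1[j]}) + \epsilon)) \, \frac{\exp(n(H(\hat{Y}_{1[j]} | X_{[j]}, X_{1[j]}) + 2\epsilon))}{(1 - \epsilon) \exp(n(H(\hat{Y}_{1[j]} | X_{1[j]}) - 2\epsilon))} \\
&\le M \exp(n(I(\hat{Y}_{1[j]}; Y_{1[j]} | X_{1[j]}, X_{[j]}) + 6\epsilon))
\end{aligned} \tag{53}$$

We also define $\mathcal{F}_2$ to be the set consisting of all feasible codewords $x^n_{[j]}$ for the $j$th block for
a given $x^n_{[j-1]}$. Then,

$$F_2 \triangleq |\mathcal{F}_2| = M \tag{54}$$

Similarly, we define $\mathcal{F}_3$ to be the set consisting of all feasible codewords $\hat{y}^n_{1[j]}$ for the $j$th block
for a given $x^n_{[j]}$ and $x^n_{1[j]}$. Then,

$$\begin{aligned}
F_3 \triangleq |\mathcal{F}_3| &\le L \, \frac{\exp(n(H(\hat{Y}_{1[j]} | X_{1[j]}, X_{[j]}) + 2\epsilon))}{(1 - \epsilon) \exp(n(H(\hat{Y}_{1[j]} | X_{1[j]}) - 2\epsilon))} \\
&\le \exp(n(I(\hat{Y}_{1[j]}; Y_{1[j]} | X_{1[j]}, X_{[j]}) + 6\epsilon))
\end{aligned} \tag{55}$$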
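The exponent $6\epsilon$ in (53) and (55) can be traced as follows. This is a sketch of the bookkeeping, assuming that $L$ is chosen so that $\frac{1}{n} \ln L \le I(\hat{Y}_{1[j]}; Y_{1[j]} | X_{1[j]}) + \epsilon$ and that $n$ is large enough for $\frac{1}{1-\epsilon} \le e^{n\epsilon}$; block subscripts are dropped for readability.

```latex
\begin{align*}
L \,\frac{e^{n(H(\hat{Y}_1 \mid X, X_1) + 2\epsilon)}}{(1-\epsilon)\, e^{n(H(\hat{Y}_1 \mid X_1) - 2\epsilon)}}
 &\le \frac{1}{1-\epsilon}\,
      e^{n\left(I(\hat{Y}_1;Y_1 \mid X_1) - \left[H(\hat{Y}_1 \mid X_1) - H(\hat{Y}_1 \mid X, X_1)\right] + 5\epsilon\right)} \\
 &= \frac{1}{1-\epsilon}\,
      e^{n\left(I(\hat{Y}_1;Y_1 \mid X_1) - I(\hat{Y}_1;X \mid X_1) + 5\epsilon\right)} \\
 &= \frac{1}{1-\epsilon}\,
      e^{n\left(I(\hat{Y}_1;Y_1 \mid X_1, X) + 5\epsilon\right)}
      && \text{since } X \to (Y_1, X_1) \to \hat{Y}_1 \\
 &\le e^{n\left(I(\hat{Y}_1;Y_1 \mid X_1, X) + 6\epsilon\right)} .
\end{align*}
```

Multiplying by the $M$ choices of the fresh message gives the bound on $F_1$; without that factor the same chain gives the bound on $F_3$.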



We define the error event E2jk 

E2jk = U 



f -n -n -n -n \ ( n n ~n -n \ 

'"=[i]'--'''[i-i]'''i[i]--'''i[fc~i]J-l''[i]'--'"=U-i]'^i[i]--'^i[fe-i]J 



(^fl] ) ■ ■ ■ 5 ^fs] 5 ^1(1] 5 • • • 5 ^"[B] ) ^1[1] ) ■ ■ ■ ) ) ; ■ • • 5 ^ ^ (56) 

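Then, we have

$$\Pr(E_2 \cap E_1^c) \le \sum_{j=2}^{B-1} \sum_{k=2}^{B-1} \Pr(E_{2jk} \cap E_1^c) \tag{57}$$

and

$$\Pr(E_{2jk} \cap E_1^c) \le |\mathcal{A}_{jk}| \max_{(\bar{x}^n_{[1]}, \ldots, \bar{x}^n_{[B-1]}, \bar{\hat{y}}^n_{1[1]}, \ldots, \bar{\hat{y}}^n_{1[B-1]}) \in \mathcal{A}_{jk}}
P_1(\bar{x}^n_{[1]}, \ldots, \bar{x}^n_{[B-1]}, \bar{\hat{y}}^n_{1[1]}, \ldots, \bar{\hat{y}}^n_{1[B-1]}) \tag{58}$$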

where 

•Ajk 

codeword (xj^, . . . , ^^[1]' • • • ' ViiB-i]) '■ 

y^[i\^ . . . , xj^_-^], y^^p . . . , yi]j._^ I — I xj-^p . . . , (/^-^], . . . , 

(59) 

Pi (Xp] , . . . , X^B-l] 5 ^l^Il] ) ■ ■ ■ ) yi[B-l] ) 

= Pr((X[\], . . . , x"b]5 ^"[l]; • • • 5 ^"[B] ' ^1[1] ) • • • 5 ?/[!]) • • • ; ^ ^) (^0) 



iveu ^Xpj, . . . , Xj^], (/^j^p . . . , (/^^], x^p], . . . , x^jgp (/pp . . . , (/j^jy t is- 

In order to have the probability of such error events go to zero, we need the following
conditions to hold.

When $j = k$, from the structure of the block Markov code and (53), we have

$$|\mathcal{A}_{jk}| = F_1^{B-j} \le M^{B-j} \exp(n(B - j)(I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) + 6\epsilon)) \tag{61}$$

and



Pi{x"i], ■ ■ ■ ,xfB-i],yi[i], • • • 5 yi[B-i]) 

< exp{n{H{xl^-'\Y};^-'\x[ll,^\Yi;^\Y^^^^^ + 2e)) 

X exp(-n(/7(Xj^-^l,F/[^f ^UI{;-U]l^[.-i],^ibl) - 2e)) 
= exp{n{-I{xl^f'\Y};;-'\x[^j^^^;Yl^^^ (62) 

When $j < k$, we have

$$|\mathcal{A}_{jk}| = F_2^{k-j} F_1^{B-k} \le M^{B-j} \exp(n(B - k)(I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) + 6\epsilon)) \tag{63}$$

and 

< exp(n(/7(X(^-^ F'tr^uifj+i] iC' -r'^' ^b-U' ^S) + 2^)) 

X exp(-n(if(X[^-^r[tf ^x|fJ^y|X[,_i],Xi[,]) - 2e)) 

= exp(n(-/(X[[^-^] , Yll-'^ , x|f,Vy ; F , , X[^] , yl^-^l , x|;;;.^y |X[,_i] , Xi[,] ) + 4e)) 

(64) 

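When $j > k$, we have

$$|\mathcal{A}_{jk}| = F_3^{j-k} F_1^{B-j} \le \exp(n(j - k)(I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) + 6\epsilon))
\times M^{B-j} \exp(n(B - j)(I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) + 6\epsilon)) \tag{65}$$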

and 

Pi (xj'y , . . . , ^^1] , . . . , y^B-i] ) 

< exp{n{H{xl^-'\Yi;i-'\x[^l,^\Yl^\Y^B],X[^^^ Xi[,]) + 2^)) 

X exp(-n(i7(Xj^-^l,f/[ff^UIf,Vi]|xf^f^Uiw) - 2e)) 
= ^Mn{-I{xl^~'\Yi;i;'\x[^l^^-,Yl^\ (66) 

Thus, when $n$ is sufficiently large, using (58) and (61) through (66), we have

$$\Pr(E_{2jk} \cap E_1^c) \le \epsilon, \quad j, k = 2, \ldots, B - 1 \tag{67}$$

if the following conditions are satisfied:
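1. For all $j$ such that $1 \le j \le B - 1$,

$$\frac{1}{n}(B - j) \ln M + (B - j) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]})
\le I(X_{[j]}^{[B-1]}, \hat{Y}_{1[j]}^{[B-1]}, X_{1[j+1]}^{[B]}; Y_{[j]}^{[B]}, \hat{Y}_{1[B]}, X_{[B]} | X_{[j-1]}, X_{1[j]}) \tag{68}$$

2. For all $j, k$ such that $1 \le j < k \le B - 1$,

$$\frac{1}{n}(B - j) \ln M + (B - k) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]})
\le I(X_{[j]}^{[B-1]}, \hat{Y}_{1[k]}^{[B-1]}, X_{1[k+1]}^{[B]}; Y_{[j]}^{[B]}, \hat{Y}_{1[B]}, X_{[B]}, \hat{Y}_{1[j]}^{[k-1]}, X_{1[j+1]}^{[k]} | X_{[j-1]}, X_{1[j]}) \tag{69}$$

3. For all $j, k$ such that $1 \le k < j \le B - 1$,

$$(j - k) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]}) + \frac{1}{n}(B - j) \ln M + (B - j) I(\hat{Y}_{1[l]}; Y_{1[l]} | X_{1[l]}, X_{[l]})
\le I(X_{[j]}^{[B-1]}, \hat{Y}_{1[k]}^{[B-1]}, X_{1[k+1]}^{[B]}; Y_{[k]}^{[B]}, \hat{Y}_{1[B]}, X_{[B]}, X_{[k]}^{[j-1]} | X_{[k-1]}, X_{1[k]}) \tag{70}$$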

Therefore, we have

$$P_e = \Pr(E_1) + \Pr(E_2 \cap E_1^c) \le (2B + B^2)\epsilon \tag{71}$$

When $n$ is sufficiently large, $(2B + B^2)\epsilon$ can be made arbitrarily small.
A.2 Lower Bounding the Mutual Informations in (3), (4), (5)

For the right hand side of (3), we have

[2] 

[3] - «-i 

SI ^ — > ^ [^^1] 

> 2^ /(y[i];X[;],ri[i],Xi[,]|X[j_i],Xiy],ryj ) 

i=j+i 

+ -^C^lBj; -^[B]7 \Xi[B], X[B-1], Xy^i], Xi[j],Yy^ ^ ) 

+ ^(^[B];^l[i?]7^[B-l]|^[i-l]7^1[i]7^[jf 

+ -^(^[B]; -^^[517 Yl[B], Xi[B], X[B-l]\X[j_i], Xi[j],Yl-^^ 

[5] ^ 

E] 

> (S - j)/(^[z];X[;],n[,],Xi[,]|X[,_2]7^i[^-i]7>^[^-i]) (72) 

where 

1. follows from the chain rule; 

2. because of Markov properties 1 and 2; 

3. because of the stationarity of the random process and the property that conditioning 
reduces entropy; 

4. because of Markov property 2; 

5. because of Markov property 1; 
6. because of Markov property 2 and the stationarity of the random process.

For the right hand side of (4), we have



fc— 1 



i=A:+l 

[2] ''"^ 



l=k+l 



+ I{XiB-i],XiiB];YiB], Yi[B],X[B] ^ly]) 
fc-i 

I 5^ /(XH;r[,],Fi[,]|Xy_i],r[;-^i,f4^Ufi]) 

i=fc+l 

+ (-^[B] ; YlB] , Yl [B] I X[B-1] , Xi [B] ) 

+ I{X[B],Yl[B];Y[B] |Xy_i+ij_fc], YjJ^^^l^j, >^iy+iLfe], ^ly'+B-fc]) 

-[B-1] ^[k-1] ylk] X 



lb] 

i=fc+i 



i=fc+i 

+ -^(-^[B]; '^[B], Yi[B]\XiiB], S) + FuB], Yj^ilS*) 

E] , . 

$\ge (k-j)\, I(X_{[i]}; Y_{[i]}, \hat{Y}_{1[i]} \mid X_{1[i]}, \hat{Y}_{1[i-1]}, X_{1[i-1]}, X_{[i-2]})$

$\quad + (B-k)\, I(Y_{[i]}; X_{[i]}, \hat{Y}_{1[i]}, X_{1[i]} \mid X_{[i-2]}, X_{1[i-1]}, \hat{Y}_{1[i-1]})$
where

Q^(Y, , y[B-M i^l^-y , vl^-l] vt'^-^l Yl'^h 

and 

1. follows from the chain rule; 

2. because of Markov properties 1 and 2; 

3. because of the stationarity of the random process; 

4. because of the following derivation 

H^[B] ; ^[B] , \X[B-1] , Xl[B] ) 

+ H^[B], Yl[B];YiB]\Xij_i+B~k], ^[J+iJ-fc]> ^ly+B-k]^ ^ly+B-k]) 

+ I{X[B-i] , ; Y[B] , , X[B] \X[j^i] , YjJ^"^^ , y/y^^' , ) 

> I{^[B] ; ^[B] , l-'^IB-l] , Xi[B] , S) + I{X[B] , Yi[B] ] ^[B] \Xi[B] , S) 
+ I{X[B-1], Xi[B];Y[B],Yi[B]\S) 

> I{XiB];Y[B],Yi[B]\XiB-i],Xi[B], S) + I{X[B],Yi[B];Y[B]\Xi[B], S) 

+ -^(-^[B-l]; ^B]- ^l[B]|-^l[B], 5*) + I{Xi[B]', YlB]\S) 
= H^[B]',YlB],Yi[B]\Xi[B], S) + I{XiB],Yi[B],Xi[B];YiB]\S) 

5. because of Markov properties 1 and 2 and the stationarity of the random process.

For the right hand side of (5), we have



where 



El ^ 

i=fc+i 

Z=fc+1 

l=j+l 

+ I{Y[B];X[B],YiiB],Xi[B] \xl^:^^l.^, Xi[k+B-j], Y[k+B-j]) 
+ IiY[B],Yi[B],XiB];Xyf'^ , Xi[B] \X^l^ ^ , Xi[k], Yi^'^ ) 



> 

/=fc+i 



+ 5^ /(r[,];X[,],fi[,],X^,]|xf^f^UiW,V'[ff^l) 

l=j+l 

+ H^iB] ', Yi [B] , Xi [B] I -^[j^' , 5") + / ; , Yi [B] , Xi [b] \ S') 

$\ge (j-k)\, I(Y_{[i]}; \hat{Y}_{1[i]}, X_{1[i]} \mid X_{[i]}, X_{[i-1]}, X_{1[i-1]}, \hat{Y}_{1[i-1]})$

$\quad + (B-j)\, I(Y_{[i]}; X_{[i]}, \hat{Y}_{1[i]}, X_{1[i]} \mid X_{[i-2]}, X_{1[i-1]}, \hat{Y}_{1[i-1]})$



S' - {Xi[k+B-j],Yl^^ ^\Xi[k]) 



and 

1. follows from the chain rule; 

2. because of Markov properties 1 and 2; 

3. because of the stationarity of the random process; 
4. because of the following derivation



I{Y[B]; Yi[B]\X[B],Xi[B]) + I{Y[B];X[B],Yi[B],Xi[B] IXjf^^Lj], ^[l+B^lj]) 



> I{Y[B];YiiB]\X[B], XiiB], S') + I{Y[B];X[B],YiiB],XiiB]\Xy^ ^ ,5") 

+ I(Y[B] , Yi[B] , X[B] ; -^r'^ , Xi[B] I S") 




(78) 



5. because of Markov properties 1 and 2 and the stationarity of the random process.



A.3 Proof of Theorem 3

First, we note that condition 1 is equivalent to the expression in Theorem 2. We also note
that condition 2 is seemingly weaker than condition 1 because (36) is implied by (33) and
(34), and condition 3 is seemingly stronger than condition 2 because condition 3 consists
of every element in condition 2 plus (38). Even though they seem different, these three
conditions are indeed equivalent. The equivalence of conditions 2 and 3 is shown in [5].
Here, we use a similar proof technique to show the equivalence of conditions 1 and 2; a
similar result is given in [7] by means of time-sharing. For a given distribution
$p(x, x_1, y, y_1, \hat{y}_1)$, condition 1 is stronger than condition 2, which means that an
arbitrary rate $R$ satisfying condition 1 will also satisfy condition 2. Conversely, for a
rate $R$ satisfying condition 2, if (34) is satisfied, then condition 1 is satisfied. If (34)
is not satisfied, i.e.,

$$I(\hat{Y}_1; Y_1 \mid X_1) > I(\hat{Y}_1; Y \mid X_1) + I(X_1; Y) \tag{79}$$

we know that $R \in [0, R^*]$, where

$$R^* - I(X; \hat{Y}_1 \mid X_1) \le I(X; Y \mid \hat{Y}_1, X_1) \tag{80}$$

$$R^* - I(X; \hat{Y}_1 \mid X_1) + I(\hat{Y}_1; Y_1 \mid X_1) = I(X, \hat{Y}_1; Y \mid X_1) + I(X_1; Y) \tag{81}$$

That is, $R^*$ is defined such that (36) is satisfied with equality. We may rewrite (80) and (81)
as

$$R^* \le I(X; Y \mid X_1) + I(X; \hat{Y}_1 \mid Y, X_1) \tag{82}$$

$$R^* = I(X, X_1; Y) - I(\hat{Y}_1; Y_1 \mid X, X_1, Y) \tag{83}$$
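
The rewriting of (80) and (81) as (82) and (83) can be checked with the chain rule for mutual information. The sketch below assumes the usual compress-and-forward Markov chain $\hat{Y}_1 \to (Y_1, X_1) \to (X, Y)$, which is not restated in this appendix, and it assumes that the hatted symbol denotes the relay's compressed observation.

```latex
% Assumption (not restated in this appendix):
%   \hat{Y}_1 -- (Y_1, X_1) -- (X, Y) forms a Markov chain.
\begin{align*}
% (80) -> (82): expand I(X; Y, \hat{Y}_1 | X_1) in two different orders.
I(X;\hat{Y}_1 \mid X_1) + I(X;Y \mid \hat{Y}_1, X_1)
  &= I(X; Y, \hat{Y}_1 \mid X_1)
   = I(X;Y \mid X_1) + I(X;\hat{Y}_1 \mid Y, X_1). \\[4pt]
% (81) -> (83), step 1: right-hand side of (81).
I(X,\hat{Y}_1; Y \mid X_1) + I(X_1;Y)
  &= I(X, X_1; Y) + I(\hat{Y}_1; Y \mid X, X_1). \\
% Step 2: the Markov chain gives I(\hat{Y}_1; X | X_1, Y_1) = 0.
I(\hat{Y}_1; Y_1 \mid X_1)
  &= I(X;\hat{Y}_1 \mid X_1) + I(\hat{Y}_1; Y_1 \mid X, X_1). \\
% Step 3: the Markov chain gives I(\hat{Y}_1; Y | X, X_1, Y_1) = 0.
I(\hat{Y}_1; Y_1 \mid X, X_1)
  &= I(\hat{Y}_1; Y \mid X, X_1) + I(\hat{Y}_1; Y_1 \mid X, X_1, Y). \\[4pt]
% Substituting steps 1-3 into (81):
R^* &= I(X, X_1; Y) + I(\hat{Y}_1; Y \mid X, X_1) - I(\hat{Y}_1; Y_1 \mid X, X_1) \\
    &= I(X, X_1; Y) - I(\hat{Y}_1; Y_1 \mid X, X_1, Y).
\end{align*}
```

The first line turns the bound (80) into (82), and the last line is (83).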

We define a new random variable $\hat{Y}_1'$ such that $\hat{Y}_1'$ has the same marginal distribution as
$\hat{Y}_1$ and $\hat{Y}_1' \to \hat{Y}_1 \to (Y_1, X, X_1, Y)$. Due to the continuity of mutual information, there
exists a choice of $\hat{Y}_1'$ such that $I(X; \hat{Y}_1' \mid Y, X_1) = A$ for any $A \in [0, I(X; \hat{Y}_1 \mid Y, X_1)]$. If
$R^* - I(X; Y \mid X_1) > 0$, we choose $\hat{Y}_1'$ such that $R^* = I(X; Y \mid X_1) + I(X; \hat{Y}_1' \mid Y, X_1)$. We note
that, in this case, $I(Y_1; \hat{Y}_1 \mid X, X_1, Y) \ge I(Y_1; \hat{Y}_1' \mid X, X_1, Y)$. Thus,

$$R^* = I(X; Y \mid X_1) + I(X; \hat{Y}_1' \mid Y, X_1) \tag{84}$$

$$R^* \le I(X, X_1; Y) - I(Y_1; \hat{Y}_1' \mid X, X_1, Y) \tag{85}$$

which means that $R^*$ satisfies condition 1 with joint distribution $p(x, x_1, y, y_1, \hat{y}_1')$ and so does
any $R < R^*$. If $R^* - I(X; Y \mid X_1) \le 0$, we choose $\hat{Y}_1'$ independent of $(\hat{Y}_1, X, X_1, Y_1, Y)$. In
this case,

$$R^* \le I(X; Y \mid X_1) + I(X; \hat{Y}_1' \mid Y, X_1) = I(X; Y \mid X_1) \tag{86}$$

$$0 = I(Y_1; \hat{Y}_1' \mid X_1) \le I(\hat{Y}_1'; Y \mid X_1) + I(X_1; Y) \tag{87}$$

Therefore, in this case, $R^*$ satisfies condition 1 with joint distribution $p(x, x_1, y, y_1, \hat{y}_1')$ and
so does any $R < R^*$.

As mentioned above, the equivalence of conditions 2 and 3 is shown in [5]. For
completeness, we restate their proof here. For a given distribution $p(x, x_1, y, y_1, \hat{y}_1)$,
condition 3 is stronger than condition 2, which means that an arbitrary rate $R$ satisfying
condition 3 will also satisfy condition 2. Conversely, for a rate $R$ satisfying condition 2,
if (38) is satisfied, then condition 3 is satisfied. If (38) is not satisfied, i.e., the following
inequalities are satisfied

$$R - I(X; \hat{Y}_1 \mid X_1) \le I(X; Y \mid \hat{Y}_1, X_1) \tag{88}$$

$$I(\hat{Y}_1; Y_1 \mid X_1, X) > I(\hat{Y}_1; Y \mid X_1, X) + I(X_1; Y \mid X) \tag{89}$$

$$R - I(X; \hat{Y}_1 \mid X_1) + I(\hat{Y}_1; Y_1 \mid X_1) \le I(X, \hat{Y}_1; Y \mid X_1) + I(X_1; Y) \tag{90}$$

then the following inequalities are also satisfied, since we simply drop the first inequality:

$$I(\hat{Y}_1; Y_1 \mid X_1, X) > I(\hat{Y}_1; Y \mid X_1, X) + I(X_1; Y \mid X) \tag{91}$$

$$R - I(X; \hat{Y}_1 \mid X_1) + I(\hat{Y}_1; Y_1 \mid X_1) \le I(X, \hat{Y}_1; Y \mid X_1) + I(X_1; Y) \tag{92}$$
By combining (91) and (92), we have

$$\begin{aligned}
R &\le I(X; \hat{Y}_1 \mid X_1) - I(\hat{Y}_1; Y_1 \mid X_1) + I(\hat{Y}_1; Y_1 \mid X_1, X) \\
  &\quad + I(X, \hat{Y}_1; Y \mid X_1) + I(X_1; Y) - I(\hat{Y}_1; Y \mid X_1, X) - I(X_1; Y \mid X) \\
  &\le I(X; Y \mid X_1) - \bigl(I(X_1; Y \mid X) - I(X_1; Y)\bigr) \\
  &\le I(X; Y \mid X_1)
\end{aligned} \tag{93}$$

which implies condition 3, i.e., (37), (38) and (39), with $\hat{Y}_1$ set to be a constant.
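
The simplification behind (93) can be spelled out term by term. The sketch below rests on two assumptions that this appendix does not restate: the compress-and-forward Markov chain $\hat{Y}_1 \to (Y_1, X_1) \to (X, Y)$, and independence of the channel inputs $X$ and $X_1$, as in the usual compress-and-forward input distribution $p(x)p(x_1)$.

```latex
% Assumptions (not restated in this appendix):
%   (i)  \hat{Y}_1 -- (Y_1, X_1) -- (X, Y) forms a Markov chain;
%   (ii) X and X_1 are independent.
\begin{align*}
% The first three terms of (93) cancel, by (i) and the chain rule:
I(\hat{Y}_1; Y_1 \mid X_1)
  &= I(X; \hat{Y}_1 \mid X_1) + I(\hat{Y}_1; Y_1 \mid X_1, X) \\
\Longrightarrow\quad
I(X; \hat{Y}_1 \mid X_1) - I(\hat{Y}_1; Y_1 \mid X_1) + I(\hat{Y}_1; Y_1 \mid X_1, X) &= 0. \\[4pt]
% Chain rule on I(X, \hat{Y}_1; Y | X_1):
I(X, \hat{Y}_1; Y \mid X_1) - I(\hat{Y}_1; Y \mid X_1, X) &= I(X; Y \mid X_1). \\[4pt]
% The remaining two terms, using (ii), i.e. I(X_1; X) = 0:
I(X_1; Y \mid X) = I(X_1; X, Y) &\ge I(X_1; Y).
\end{align*}
```

Adding the three pieces gives $I(X; Y \mid X_1) - \bigl(I(X_1; Y \mid X) - I(X_1; Y)\bigr) \le I(X; Y \mid X_1)$, which is the bound on $R$ stated in (93).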